Results 1 - 20 of 3,802
1.
Nature ; 627(8003): 424-430, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418874

ABSTRACT

Mycobacterium tuberculosis (Mtb) is a bacterial pathogen that causes tuberculosis (TB), an infectious disease that is responsible for major health and economic costs worldwide [1]. Mtb encounters diverse environments during its life cycle and responds to these changes largely by reprogramming its transcriptional output [2]. However, the mechanisms of Mtb transcription and how they are regulated remain poorly understood. Here we use a sequencing method that simultaneously determines both termini of individual RNA molecules in bacterial cells [3] to profile the Mtb transcriptome at high resolution. Unexpectedly, we find that most Mtb transcripts are incomplete, with their 5' ends aligned at transcription start sites and 3' ends located 200-500 nucleotides downstream. We show that these short RNAs are mainly associated with paused RNA polymerases (RNAPs) rather than being products of premature termination. We further show that the high propensity of Mtb RNAP to pause early in transcription relies on the binding of the σ-factor. Finally, we show that a translating ribosome promotes transcription elongation, revealing a potential role for transcription-translation coupling in controlling Mtb gene expression. In sum, our findings depict a mycobacterial transcriptome that prominently features incomplete transcripts resulting from RNAP pausing. We propose that the pausing phase constitutes an important transcriptional checkpoint in Mtb that allows the bacterium to adapt to environmental changes and could be exploited for TB therapeutics.


Subjects
Bacterial Gene Expression Regulation , Mycobacterium tuberculosis , Bacterial RNA , Transcriptome , DNA-Directed RNA Polymerases/metabolism , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Bacterial RNA/analysis , Bacterial RNA/biosynthesis , Bacterial RNA/genetics , Transcriptome/genetics , Tuberculosis/microbiology , Messenger RNA/analysis , Messenger RNA/biosynthesis , Messenger RNA/genetics , Transcription Initiation Site , Sigma Factor/metabolism , Ribosomes/metabolism , Protein Biosynthesis
2.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407414

ABSTRACT

MOTIVATION: Prediction and identification of core promoter elements and transcription factor binding sites is essential for understanding the mechanism of transcription initiation and deciphering the biological activity of a specific locus. Thus, there is a need for an up-to-date tool to detect and curate core promoter elements/motifs in any provided nucleotide sequences. RESULTS: Here, we introduce ElemeNT 2023, a new and enhanced version of the Elements Navigation Tool, which provides novel capabilities for assessing evolutionary conservation and for readily evaluating the quality of high-throughput transcription start site (TSS) datasets, leveraging preferential motif positioning. ElemeNT 2023 is accessible both as a fast web-based tool and via command line (no coding skills are required to run the tool). While this tool is focused on core promoter elements, it can also be used for searching any user-defined motif, including sequence-specific DNA binding sites. Furthermore, ElemeNT's CORE database, which contains predicted core promoter elements around annotated TSSs, is now expanded to cover 10 species, ranging from worms to human. In this applications note, we describe the new workflow and demonstrate a case study using ElemeNT 2023 for core promoter composition analysis of diverse species, revealing motif prevalence and highlighting evolutionary insights. We discuss how this tool facilitates the exploration of uncharted transcriptomic data, appraises TSS quality, and aids in designing synthetic promoters for gene expression optimization. Taken together, ElemeNT 2023 empowers researchers with comprehensive tools for meticulous analysis of sequence elements and gene expression strategies. AVAILABILITY AND IMPLEMENTATION: ElemeNT 2023 is freely available at https://www.juven-gershonlab.org/resources/element-v2023/. The source code and command line version of ElemeNT 2023 are available at https://github.com/OritAdato/ElemeNT.


Subjects
Software , Humans , Genetic Promoter Regions , Protein Binding , Transcription Initiation Site
3.
Int J Mol Sci ; 25(3)2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38338773

ABSTRACT

Since the discovery of physical peculiarities around transcription start sites (TSSs) and a site corresponding to the TATA box, research has revealed only the average features of these sites. Unsettled enigmas include the individual genes with these features and whether they relate to gene function. Herein, using 10 physical properties of DNA, including duplex DNA free energy, base stacking energy, protein-induced deformability, and stabilizing energy of Z-DNA, we clarified for the first time that approximately 97% of the promoters of 21,056 human protein-coding genes have distinctive physical properties around the TSS and/or position -27; of these, nearly 65% exhibited such properties at both sites. Furthermore, about 55% of the 21,056 genes had a minimum value of regional duplex DNA free energy within TSS-centered ±300 bp regions. Notably, distinctive physical properties within the promoters and free energies of the surrounding regions separated human protein-coding genes into five groups; each contained specific gene ontology (GO) terms. The group represented by immune response genes differed distinctly from the other four regarding the parameter of the free energies of the surrounding regions. A vital suggestion from this study is that physical-feature-based analyses of genomes may reveal new aspects of the organization and regulation of genes.


Subjects
DNA , Humans , Genetic Promoter Regions , TATA Box/genetics , Transcription Initiation Site
4.
G3 (Bethesda) ; 14(3)2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38253712

ABSTRACT

Transcriptional initiation is among the first regulated steps controlling eukaryotic gene expression. High-throughput profiling of fungal and animal genomes has revealed that RNA Polymerase II often initiates transcription in both directions at the promoter transcription start site, but generally only elongates productively into the gene body. Additionally, Pol II can initiate transcription in both directions at cis-regulatory elements such as enhancers. These bidirectional RNA Polymerase II initiation events can be observed directly with methods that capture nascent transcripts, and they are also revealed indirectly by the presence of transcription-associated histone modifications on both sides of the transcription start site or cis-regulatory elements. Previous studies have shown that nascent RNAs and transcription-associated histone modifications in the model plant Arabidopsis thaliana accumulate mainly in the gene body, suggesting that transcription does not initiate widely in the upstream direction from genes in this plant. We compared transcription-associated histone modifications and nascent transcripts at both transcription start sites and cis-regulatory elements in A. thaliana, Drosophila melanogaster, and Homo sapiens. Our results provide evidence for mostly unidirectional RNA Polymerase II initiation at both promoters and gene-proximal cis-regulatory elements of A. thaliana, whereas bidirectional transcription initiation is observed widely at promoters in both D. melanogaster and H. sapiens, as well as cis-regulatory elements in Drosophila. Furthermore, the distribution of transcription-associated histone modifications around transcription start sites in the Oryza sativa (rice) and Glycine max (soybean) genomes suggests that unidirectional transcription initiation is the norm in these genomes as well. 
These results suggest that there are fundamental differences in transcriptional initiation directionality between flowering plant and metazoan genomes, which are manifested as distinct patterns of chromatin modifications around RNA polymerase initiation sites.


Subjects
Arabidopsis , Chromatin , Animals , Chromatin/genetics , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Genetic Transcription , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Transcription Initiation Site , Plants/genetics
5.
Nat Struct Mol Biol ; 31(1): 190-202, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38177677

ABSTRACT

Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.


Subjects
Polyphosphates , RNA Polymerase II , Saccharomyces cerevisiae , RNA Polymerase II/metabolism , Saccharomyces cerevisiae/metabolism , Base Sequence , Transcription Initiation Site , Nucleosides , Genetic Transcription , Guanosine Triphosphate
6.
Nucleic Acids Res ; 52(D1): D322-D333, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956335

ABSTRACT

Transposable elements (TEs) are abundant in the genome and serve as crucial regulatory elements. Some TEs function as epigenetically regulated promoters, and these TE-derived transcription start sites (TSSs) play a crucial role in regulating genes associated with specific functions, such as cancer and embryogenesis. However, the lack of an accessible database that systematically gathers TE-derived TSS data is a current research gap. To address this, we established TE-TSS, an integrated data resource of human and mouse TE-derived TSSs (http://xozhanglab.com/TETSS). TE-TSS has compiled 2681 RNA sequencing datasets, spanning various tissues, cell lines and developmental stages. From these, we identified 5768 human TE-derived TSSs and 2797 mouse TE-derived TSSs, with 47% and 38% being experimentally validated, respectively. TE-TSS enables comprehensive exploration of TSS usage in diverse samples, providing insights into tissue-specific gene expression patterns and transcriptional regulatory elements. Furthermore, TE-TSS compares TE-derived TSS regions across 15 mammalian species, enhancing our understanding of their evolutionary and functional aspects. The establishment of TE-TSS facilitates further investigations into the roles of TEs in shaping the transcriptomic landscape and offers valuable resources for comprehending their involvement in diverse biological processes.


Subjects
DNA Transposable Elements , Genetic Databases , Nucleic Acid Regulatory Sequences , Transcription Initiation Site , Animals , Humans , Mice , DNA Transposable Elements/genetics , Mammals/genetics , Genetic Promoter Regions , RNA Sequence Analysis , Internet
7.
Nucleic Acids Res ; 52(2): e7, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-37994784

ABSTRACT

Precise detection of the transcriptional start site (TSS) is key to characterizing the transcriptional regulation of genes and to annotating newly sequenced genomes. Here, we describe the development of an improved method, designated 'TSS-seq2.' This method is an iterative improvement of TSS-seq, a previously published enzymatic cap-structure conversion method to detect TSSs in base sequences. By modifying the original procedure, including by introducing split ligation at the key cap-selection step, the yield and the accuracy of the reaction have been substantially improved. For example, TSS-seq2 can be conducted using as little as 5 ng of total RNA with an overall accuracy of 96%; this yields a less-biased and more precise detection of TSSs. We then applied TSS-seq2 for TSS analysis of four plant species that had not yet been analyzed by any previous TSS method.


Subjects
RNA Sequence Analysis , Transcription Initiation Site , Base Sequence , Gene Expression Regulation , Genetic Promoter Regions , RNA Sequence Analysis/methods
8.
Nat Struct Mol Biol ; 30(12): 1970-1984, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37996663

ABSTRACT

Global changes in transcriptional regulation and RNA metabolism are crucial features of cancer development. However, little is known about the role of the core promoter in defining transcript identity and post-transcriptional fates, a potentially crucial layer of transcriptional regulation in cancer. In this study, we use CAGE-seq analysis to uncover widespread use of dual-initiation promoters in which non-canonical, first-base-cytosine (C) transcription initiation occurs alongside first-base-purine initiation across 59 human cancers and healthy tissues. C-initiation is often followed by a 5' terminal oligopyrimidine (5'TOP) sequence, dramatically increasing the range of genes potentially subjected to 5'TOP-associated post-transcriptional regulation. We show selective, dynamic switching between purine and C-initiation site usage, indicating transcription initiation-level regulation in cancers. We additionally detail global metabolic changes in C-initiation transcripts that mark differentiation status, proliferative capacity, radiosensitivity, and response to irradiation and to PI3K-Akt-mTOR and DNA damage pathway-targeted radiosensitization therapies in colorectal cancer organoids and cancer cell lines and tissues.
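
The first-base classification described in this abstract can be sketched roughly as follows. This is a minimal illustration only: the function names are hypothetical, and the minimum pyrimidine run length of 4 used to flag a candidate 5'TOP is an assumption chosen for the example, not a threshold taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def initiation_class(transcript_5p: str) -> str:
    """Label a transcript by its first transcribed base."""
    first = transcript_5p[:1].upper()
    if first == "C":
        return "C-initiation"
    if first in PURINES:
        return "purine-initiation"
    return "other"


def has_5top(transcript_5p: str, min_run: int = 4) -> bool:
    """Flag a candidate 5'TOP: an initiating C followed by an
    uninterrupted pyrimidine run of at least min_run bases.
    min_run is an assumed, illustrative threshold."""
    seq = transcript_5p.upper()
    if not seq.startswith("C"):
        return False
    run = 0
    for base in seq[1:]:
        if base not in PYRIMIDINES:
            break
        run += 1
    return run >= min_run
```

For instance, under these assumptions `has_5top("CTTTCTGA")` is true because the initiating C is followed by five pyrimidines before the first purine.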


Subjects
Phosphatidylinositol 3-Kinases , RNA , Humans , Transcription Initiation Site , RNA/genetics , Cell Proliferation , Purines
9.
PLoS Comput Biol ; 19(11): e1011491, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37983292

ABSTRACT

Core promoters are stretches of DNA at the beginning of genes that contain information that facilitates the binding of transcription initiation complexes. Different functional subsets of genes have core promoters with distinct architectures and characteristic motifs. Some of these motifs inform the selection of transcription start sites (TSS). By discovering motifs with fixed distances from known TSS positions, we could in principle classify promoters into different functional groups. Due to the variability and overlap of architectures, promoter classification is a difficult task that requires new approaches. In this study, we present a new method based on non-negative matrix factorisation (NMF) and the associated software called seqArchR that clusters promoter sequences based on their motifs at near-fixed distances from a reference point, such as TSS. When combined with experimental data from CAGE, seqArchR can efficiently identify TSS-directing motifs, including known ones like TATA, DPE, and nucleosome positioning signal, as well as novel lineage-specific motifs and the function of genes associated with them. By using seqArchR on developmental time courses, we reveal how relative use of promoter architectures changes over time with stage-specific expression. seqArchR is a powerful tool for initial genome-wide classification and functional characterisation of promoters. Its use cases are more general: it can also be used to discover any motifs at near-fixed distances from a reference point, even if they are present in only a small subset of sequences.


Subjects
Algorithms , Software , Genetic Promoter Regions/genetics , Transcription Initiation Site , Nucleosomes
10.
Nat Commun ; 14(1): 7240, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945584

ABSTRACT

Five-prime single-cell RNA-seq (scRNA-seq) has been widely employed to profile cellular transcriptomes; however, its power for analysing transcription start sites (TSS) has not been fully utilised. Here, we present a computational method suite, CamoTSS, to precisely identify TSS and quantify its expression by leveraging the cDNA on read 1, which enables effective detection of alternative TSS usage. With various experimental data sets, we have demonstrated that CamoTSS can accurately identify TSS and that the detected alternative TSS usages showed strong specificity in different biological processes, including cell types across human organs, the development of human thymus, and cancer conditions. As evidenced in nasopharyngeal cancer, alternative TSS usage can also reveal regulatory patterns including systematic TSS dysregulations.


Subjects
Nasopharyngeal Neoplasms , Humans , Transcription Initiation Site , Single-Cell Gene Expression Analysis , Transcriptome/genetics , Phenotype , Single-Cell Analysis/methods
12.
J Virol ; 97(9): e0081823, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37681957

ABSTRACT

HIV-1 uses heterogeneous transcription start sites (TSSs) to generate two RNA 5´ isoforms that adopt radically different structures and perform distinct replication functions. Although these RNAs differ in length by only two bases, exclusively, the shorter RNA is encapsidated while the longer RNA is excluded from virions and provides intracellular functions. The current study examined TSS usage and packaging selectivity for a broad range of retroviruses and found that heterogeneous TSS usage was a conserved feature of all tested HIV-1 strains, but all other retroviruses examined displayed unique TSSs. Phylogenetic comparisons and chimeric viruses' properties provided evidence that this mechanism of RNA fate determination was an innovation of the HIV-1 lineage, with determinants mapping to core promoter elements. Fine-tuning differences between HIV-1 and HIV-2, which uses a unique TSS, implicated purine residue positioning plus a specific TSS-adjacent dinucleotide in specifying multiplicity of TSS usage. Based on these findings, HIV-1 expression constructs were generated that differed from the parental strain by only two point mutations yet each expressed only one of HIV-1's two RNAs. Replication defects of the variant with only the presumptive founder TSS were less severe than those for the virus with only the secondary start site. IMPORTANCE Retroviruses use RNA both to encode their proteins and to serve in place of DNA as their genomes. A recent surprising discovery was that the genomic RNAs and messenger RNAs of HIV-1 are not identical but instead differ subtly on one of their ends. These differences enable the functional separation of HIV-1 RNAs into genome and messenger roles. In this report, we examined a broad collection of HIV-1-related viruses and discovered that each produced only one end class of RNA, and thus must differ from HIV-1 in how they specify RNA fates. 
By comparing regulatory signals, we generated virus variants that pinpointed the determinants of HIV-1 RNA fates, as well as HIV-1 variants that produced only one or the other functional class of RNA. Competition and replication assays confirmed that HIV-1 has evolved to rely on the coordinated actions of both its RNA forms.


Subjects
HIV-1 , Viral RNA , Transcription Initiation Site , HIV-1/genetics , Phylogeny , Retroviridae/genetics , Genetic Promoter Regions , Viral RNA/genetics
13.
Nature ; 622(7981): 173-179, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37731000

ABSTRACT

Lysine residues in histones and other proteins can be modified by post-translational modifications that encode regulatory information [1]. Lysine acetylation and methylation are especially important for regulating chromatin and gene expression [2-4]. Pathways involving these post-translational modifications are targets for clinically approved therapeutics to treat human diseases. Lysine methylation and acetylation are generally assumed to be mutually exclusive at the same residue. Here we report cellular lysine residues that are both methylated and acetylated on the same side chain to form Nε-acetyl-Nε-methyllysine (Kacme). We show that Kacme is found on histone H4 (H4Kacme) across a range of species and across mammalian tissues. Kacme is associated with marks of active chromatin, increased transcriptional initiation and is regulated in response to biological signals. H4Kacme can be installed by enzymatic acetylation of monomethyllysine peptides and is resistant to deacetylation by some HDACs in vitro. Kacme can be bound by chromatin proteins that recognize modified lysine residues, as we demonstrate with the crystal structure of acetyllysine-binding protein BRD2 bound to a histone H4Kacme peptide. These results establish Kacme as a cellular post-translational modification with the potential to encode information distinct from methylation and acetylation alone and demonstrate that Kacme has all the hallmarks of a post-translational modification with fundamental importance to chromatin biology.


Subjects
Acetylation , Chromatin , Lysine , Methylation , Post-Translational Protein Processing , Transcription Initiation Site , Animals , Humans , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Histones/chemistry , Histones/metabolism , Lysine/analogs & derivatives , Lysine/chemistry , Lysine/metabolism , Peptides/chemistry , Peptides/metabolism , Histone Deacetylases/metabolism
14.
BMC Microbiol ; 23(1): 243, 2023 08 31.
Article in English | MEDLINE | ID: mdl-37653502

ABSTRACT

Genome-wide analysis of transcription start sites (TSSs) has revealed unexpected complexity, since RNA polymerase recognizes not only the canonical TSSs of annotated genes. Non-canonical TSSs were detected antisense to, or within, annotated genes, as well as new intergenic (orphan) TSSs not associated with known genes. Previously, it was hypothesized that many such signals represent noise or pervasive transcription, not associated with a biological function. Here, a modified Cappable-seq protocol allowed us to determine the primary transcriptome of the enterohemorrhagic E. coli O157:H7 EDL933 (EHEC). We used four different growth media, in both exponential and stationary growth phase, each replicated three times. This yielded 19,975 EHEC canonical and non-canonical TSSs, which occurred reproducibly in three biological replicates. This questions the hypothesis of experimental noise or pervasive transcription. Accordingly, conserved promoter motifs were found upstream, indicating proper TSSs. More than 50% of 5,567 canonical and between 32% and 47% of 10,355 non-canonical TSSs were differentially expressed in different media and growth phases, providing evidence for a potential biological function of non-canonical TSSs as well. Thus, reproducible and environmentally regulated expression suggests that a substantial number of the non-canonical TSSs may have an as yet unknown function rather than being the result of noise or pervasive transcription.


Subjects
Enterohemorrhagic Escherichia coli , Escherichia coli O157 , Escherichia coli O157/genetics , Transcription Initiation Site , Cell Cycle , Culture Media
15.
J Biol Chem ; 299(9): 105130, 2023 09.
Article in English | MEDLINE | ID: mdl-37543366

ABSTRACT

Long noncoding RNAs (lncRNAs) are increasingly being recognized as modulators in various biological processes. However, due to their low expression, their systematic characterization is difficult to determine. Here, we performed transcript annotation by a newly developed computational pipeline, termed RNA-seq and small RNA-seq combined strategy (RSCS), in a wide variety of cellular contexts. Thousands of high-confidence potential novel transcripts were identified by the RSCS, and the reliability of the transcriptome was verified by analysis of transcript structure, base composition, and sequence complexity. Evidenced by the length comparison, the frequency of the core promoter and the polyadenylation signal motifs, and the locations of transcription start and end sites, the transcripts appear to be full length. Furthermore, taking advantage of our strategy, we identified a large number of endogenous retrovirus-associated lncRNAs, and a novel endogenous retrovirus-lncRNA that was functionally involved in control of Yap1 expression and essential for early embryogenesis was identified. In summary, the RSCS can generate a more complete and precise transcriptome, and our findings greatly expanded the transcriptome annotation for the mammalian community.


Subjects
Molecular Sequence Annotation , Long Noncoding RNA , RNA-Seq , Animals , Embryonic Development/genetics , Mammals/embryology , Mammals/genetics , Molecular Sequence Annotation/methods , Genetic Promoter Regions/genetics , Reproducibility of Results , Retroviridae/genetics , Long Noncoding RNA/genetics , RNA-Seq/methods , Transcription Initiation Site , Transcriptome/genetics , YAP Signaling Proteins/genetics , YAP Signaling Proteins/metabolism
16.
Cells ; 12(13)2023 06 28.
Article in English | MEDLINE | ID: mdl-37443771

ABSTRACT

Identifying tissue-specific molecular signatures of active regulatory elements is critical to understanding gene regulatory mechanisms. In this study, transcription start sites (TSS) and enhancers were identified using Cap analysis of gene expression (CAGE) across endometrial stromal cell (ESC) samples obtained from women with (n = 4) and without endometriosis (n = 4). ESC TSSs and enhancers were compared to those reported in other tissue and cell types in FANTOM5 and were integrated with RNA-seq and ATAC-seq data from the same samples for regulatory activity and network analyses. CAGE tag count differences between women with and without endometriosis were statistically tested and tags within close proximity to genetic variants associated with endometriosis risk were identified. Over 90% of tag clusters mapping to promoters were observed in cells and tissues in FANTOM5. However, some potential cell-type-specific promoters and enhancers were also observed. Regions of open chromatin identified using ATAC-seq provided further evidence of the active transcriptional regions identified by CAGE. Despite the small sample number, there was evidence of differences associated with endometriosis at 210 consensus clusters, including IGFBP5, CALD1 and OXTR. ESC TSSs were also located within loci associated with endometriosis risk from genome-wide association studies. This study provides novel evidence of transcriptional differences in endometrial stromal cells associated with endometriosis and provides a valuable cell-type specific resource of active TSSs and enhancers in endometrial stromal cells.


Subjects
Endometriosis , Genome-Wide Association Study , Humans , Female , Transcription Initiation Site , Endometriosis/genetics , Genetic Promoter Regions/genetics , Gene Expression Regulation
17.
Genome Biol ; 24(1): 165, 2023 07 12.
Article in English | MEDLINE | ID: mdl-37438847

ABSTRACT

Detecting allelic imbalance at the isoform level requires accounting for inferential uncertainty, caused by multi-mapping of RNA-seq reads. Our proposed method, SEESAW, uses Salmon and Swish to offer analysis at various levels of resolution, including gene, isoform, and aggregating isoforms to groups by transcription start site. The aggregation strategies strengthen the signal for transcripts with high uncertainty. The SEESAW suite of methods is shown to have higher power than other allelic imbalance methods when there is isoform-level allelic imbalance. We also introduce a new test for detecting imbalance that varies across a covariate, such as time.
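
The idea of aggregating isoforms into groups by transcription start site can be illustrated with a small sketch. This is not SEESAW's interface (the abstract does not describe one); the tuple layout, function names, and example counts below are all hypothetical and serve only to show how pooling per-isoform allelic counts by shared TSS yields one allelic ratio per TSS group.

```python
from collections import defaultdict


def aggregate_by_tss(isoforms):
    """Sum allelic counts over isoforms that share a gene and TSS.

    isoforms: iterable of (gene, tss_position, allele_a, allele_b)
    tuples; this layout is an assumption made for the example.
    Returns {(gene, tss_position): (allele_a_total, allele_b_total)}.
    """
    groups = defaultdict(lambda: [0.0, 0.0])
    for gene, tss, allele_a, allele_b in isoforms:
        groups[(gene, tss)][0] += allele_a
        groups[(gene, tss)][1] += allele_b
    return {key: tuple(totals) for key, totals in groups.items()}


def allelic_ratio(allele_a, allele_b):
    """Fraction of reads from allele A, or None if there are no reads."""
    total = allele_a + allele_b
    return allele_a / total if total > 0 else None


# Hypothetical counts: two isoforms share the TSS at position 100.
example = [
    ("G1", 100, 10, 30),
    ("G1", 100, 5, 15),
    ("G1", 250, 20, 20),
]
pooled = aggregate_by_tss(example)
```

Pooling in this way trades isoform-level resolution for larger counts per group, which is the sense in which aggregation can strengthen the signal for transcripts whose individual estimates are uncertain.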


Subjects
Allelic Imbalance , Uncertainty , Protein Isoforms/genetics , RNA-Seq , Transcription Initiation Site
18.
Sci Rep ; 13(1): 10835, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37407625

ABSTRACT

The prevalent one-dimensional alignment of genomic signals to a reference landmark is a cornerstone of current methods to study transcription and its DNA-dependent processes, but it is prone to mask potential relations among multiple DNA elements. We developed a systematic approach to align genomic signals to multiple locations simultaneously by expanding the dimensionality of the genomic-coordinate space. We analyzed transcription in humans and uncovered a complex dependence on the relative position of neighboring transcription start sites (TSSs) that is consistently conserved among cell types. The dependence ranges from enhancement to suppression of transcription depending on the relative distances to the TSSs, their intragenic position, and the transcriptional activity of the gene. Our results reveal a conserved hierarchy of alternative TSS usage within a previously unrecognized level of genomic organization and provide a general methodology to analyze complex functional relationships among multiple types of DNA elements.


Subjects
DNA , Genomics , Humans , Transcription Initiation Site , Genetic Promoter Regions , Genomics/methods
19.
Nucleic Acids Res ; 51(15): e80, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37403796

ABSTRACT

Cis-regulatory elements (CREs) can be classified by the shapes of their transcription start site (TSS) profiles, which are indicative of distinct regulatory mechanisms. Massively parallel reporter assays (MPRAs) are increasingly being used to study CRE regulatory mechanisms, yet the degree to which MPRAs replicate individual endogenous TSS profiles has not been determined. Here, we present a new low-input MPRA protocol (TSS-MPRA) that enables measuring TSS profiles of episomal reporters as well as after lentiviral reporter chromatinization. To sensitively compare MPRA and endogenous TSS profiles, we developed a novel dissimilarity scoring algorithm (WIP score) that outperforms the frequently used earth mover's distance on experimental data. Using TSS-MPRA and WIP scoring on 500 unique reporter inserts, we found that short (153 bp) MPRA promoter inserts replicate the endogenous TSS patterns of ∼60% of promoters. Lentiviral reporter chromatinization did not improve fidelity of TSS-MPRA initiation patterns, and increasing insert size frequently led to activation of extraneous TSS in the MPRA that are not active in vivo. We discuss the implications of our findings, which highlight important caveats when using MPRAs to study transcription mechanisms. Finally, we illustrate how TSS-MPRA and WIP scoring can provide novel insights into the impact of transcription factor motif mutations and genetic variants on TSS patterns and transcription levels.
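
The abstract does not define the WIP score, so only the baseline it is compared against, the earth mover's distance, is sketched here. The example assumes two TSS profiles given as read counts over the same positions with unit spacing; in that one-dimensional case the distance between the normalized profiles equals the sum of absolute differences between their cumulative distributions. Function names are hypothetical.

```python
from itertools import accumulate


def normalize(profile):
    """Scale read counts so that they sum to 1."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must contain at least one read")
    return [count / total for count in profile]


def earth_movers_distance(profile_a, profile_b):
    """1D earth mover's distance between two TSS profiles defined
    over the same positions (unit spacing is assumed)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same positions")
    cdf_a = accumulate(normalize(profile_a))
    cdf_b = accumulate(normalize(profile_b))
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))
```

Two identical profiles give a distance of 0, while moving all reads from the first to the third position of a three-position window gives a distance of 2, that is, the whole profile shifted by two bases.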


Subjects
Gene Expression Regulation , Nucleic Acid Regulatory Sequences , Transcription Initiation Site , Genetic Promoter Regions , Transcription Factors/genetics , High-Throughput Nucleotide Sequencing
20.
Comput Biol Chem ; 105: 107904, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37327560

ABSTRACT

MOTIVATION: Computational promoter prediction (CPP) tools designed to classify prokaryotic promoter regions usually assume that a transcription start site (TSS) is located at a predefined position within each promoter region. Such CPP tools are sensitive to any positional shifting of the TSS in a windowed region, and they are unsuitable for determining the boundaries of prokaryotic promoters. RESULTS: TSSUNet-MB is a deep learning model developed to identify the TSSs of σ70 promoters. Mononucleotide and bendability were used to encode input sequences. TSSUNet-MB outperforms other CPP tools when assessed using the sequences obtained from the neighborhood of real promoters. TSSUNet-MB achieved a sensitivity of 0.839 and specificity of 0.768 on sliding sequences, while other CPP tools cannot maintain both sensitivities and specificities in a compatible range. Furthermore, TSSUNet-MB can precisely predict the TSS position of σ70 promoter-containing regions with a 10-base accuracy of 77.6%. By leveraging the sliding window scanning approach, we further computed the confidence score of each predicted TSS, which allows for more accurately identifying TSS locations. Our results suggest that TSSUNet-MB is a robust tool for finding σ70 promoters and identifying TSSs.
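
The evaluation figures quoted in this abstract can be made concrete with a short sketch. Sensitivity and specificity follow their standard definitions; the reading of "10-base accuracy" as the fraction of predicted TSS positions that fall within 10 bases of the annotated position is an assumption, since the abstract does not spell it out. Function names and the example counts are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real promoters that are detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-promoters that are correctly rejected."""
    return true_neg / (true_neg + false_pos)


def within_k_accuracy(predicted, actual, k=10):
    """Fraction of predicted TSS positions within k bases of the
    annotated position (assumed meaning of 'k-base accuracy')."""
    pairs = list(zip(predicted, actual))
    if not pairs:
        raise ValueError("no predictions to score")
    hits = sum(1 for pred, true in pairs if abs(pred - true) <= k)
    return hits / len(pairs)
```

For example, counts of 839 true positives and 161 false negatives would correspond to the reported sensitivity of 0.839; these counts are invented to match the ratio and are not taken from the paper.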


Subjects
Escherichia coli , Transcription Initiation Site , Genetic Promoter Regions/genetics , Escherichia coli/genetics